Abstract

The aim of this manuscript is to characterize monotone 'metrics' on the space of Markov maps. Here, a
'metric' means the square of a norm defined on the tangent space. It is not necessarily induced from
an inner product (this property will hereafter be called the inner-product-assumption), unlike the usual
metric of differential geometry.

There is already a large literature on metrics on the spaces of probability distributions and quantum
states. Among these works, Cencov proved that the monotone metric on the space of probability
distributions is unique up to a constant multiple and coincides with the Fisher information metric. Petz
characterized all the monotone metrics on the quantum state space using operator means. For channels,
however, little has been known.

In this paper, we impose monotonicity under concatenation of channels before and after the given channel
families, and invariance under tensoring with identity channels. (Notably, we do not use the
inner-product-assumption.) The analysis uses the 'resource conversion' technique, which is widely used in
quantum information: we consider distillation from and formation of a family of channels. Under these
axioms, we identify the largest and the smallest 'metrics'. Interestingly, they are not induced from any
inner product, i.e., they are not metrics. Indeed, one can prove that no 'metric' satisfying our axioms can
be a metric.

This result has some impact on the axiomatic study of monotone metrics on the spaces of classical
and quantum states, since both conventional theories rely on the inner-product-assumption. We also
compute the lower and the upper bounds for some concrete examples.
1 Introduction



The aim of this manuscript is to characterize monotone 'metrics' on the space of Markov maps. Here, a 'metric'
means the square of a norm defined on the tangent space. It is not necessarily induced from an inner
product, unlike the usual metric of differential geometry.

There is already a large literature on metrics on the spaces of probability distributions and
quantum states. Cencov, in the 1970s, proved that the monotone metric on the space of probability distributions
is unique up to a constant multiple and coincides with the Fisher information metric [1]. He also discussed invariant
connections on the same space. Amari and others independently worked on the same objects, especially from
the differential-geometric point of view, and applied them to a number of problems in mathematical statistics, learning
theory, time series analysis, dynamical systems, control theory, and so on [T][5]. Quantum mechanical states
are discussed in works such as [1] [3] [S] [S] [5]. Among them, Petz [5] characterized all the monotone metrics
on the quantum state space using operator means.

For channels, however, little has been known. To my knowledge, there has been no study of the
axiomatic characterization of distance measures on the classical or quantum channel space.

In this paper, we impose monotonicity under concatenation of channels before and after the given channel
families, and invariance under tensoring with identity channels. (Notably, we do not use the
inner-product-assumption.) The analysis uses the 'resource conversion' technique, which is widely used in quantum
information: we consider distillation from and formation of a family of channels.

Under these axioms, we identify the largest and the smallest 'metrics'. Interestingly, they are not induced
from any inner product, i.e., they are not metrics. Indeed, one can prove that no 'metric' satisfying our axioms
can be a metric.

In the author's opinion, the axioms in this manuscript are reasonable and minimal, and it is essential that
being a metric in the narrow sense is not required. Hence, this result has some impact on the axiomatic study of
monotone metrics on the spaces of classical and quantum states, since both Cencov [4] and Petz [5] rely
on the inner-product-assumption. Since classical and quantum states can be viewed as channels with
constant output, it is preferable to dispense with the inner-product-assumption. This point will be discussed
in a separate manuscript.

2 Notations and conventions 

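• $\mathcal{X}_{\mathrm{in}}$ ($\mathcal{X}_{\mathrm{out}}$): the totality of the input (output) alphabet

• $\mathcal{P}_{\mathrm{in}}$ ($\mathcal{P}_{\mathrm{out}}$): the totality of the probability distributions over $\mathcal{X}_{\mathrm{in}}$ ($\mathcal{X}_{\mathrm{out}}$). In this paper, the existence
of a density with respect to an underlying measure $\mu$ is always assumed. Hence, $\mathcal{P}_{\mathrm{in}}$ ($\mathcal{P}_{\mathrm{out}}$) is equivalent
to the totality of density functions.

• $\mathcal{C}$: the totality of channels which send an element of $\mathcal{P}_{\mathrm{in}}$ to an element of $\mathcal{P}_{\mathrm{out}}$

• $\mathcal{P}_k$: the totality of probability mass functions supported on $\{1, 2, \dots, k\}$

• $\mathcal{C}_{k,l}$: the totality of Markov maps from $\mathcal{P}_k$ to $\mathcal{P}_l$

• $x$, $y$, etc.: an element of $\mathcal{X}_{\mathrm{in}}$, $\mathcal{X}_{\mathrm{out}}$

• $X$, $Y$, etc.: a random variable taking values in $\mathcal{X}_{\mathrm{in}}$, $\mathcal{X}_{\mathrm{out}}$

• A probability distribution $p$ is identified with the Markov map which sends all the input probability
distributions to $p$. (Hence it is represented by a transition matrix of rank 1.)

• $\mathcal{T}_{\cdot}(\cdot)$: tangent space

• $\delta$ etc.: an element of $\mathcal{T}_p(\mathcal{P}_{\mathrm{in}})$ etc.

• $\Delta$ etc.: an element of $\mathcal{T}_\Phi(\mathcal{C})$

• An element $\delta$ of $\mathcal{T}_p(\mathcal{P}_{\mathrm{in}})$ etc. is identified with a function $f$ such that $\int f \, d\mu = 0$.

• $g_p(\delta)$: square of a norm in $\mathcal{T}_p(\mathcal{P}_k)$

• $G_\Phi(\Delta)$: square of a norm in $\mathcal{T}_\Phi(\mathcal{C}_{k,l})$

• $J_p(\delta)$: classical Fisher information

• The local data at $p$: the pair $\{p, \delta\}$.

• The local data at $\Phi$: the pair $\{\Phi, \Delta\}$.

• $\Phi(\cdot|x) \in \mathcal{P}_{\mathrm{out}}$: the distribution of the output alphabet when the input is $x$

• $\Delta(\cdot|x) \in \mathcal{T}_{\Phi(\cdot|x)}(\mathcal{P}_{\mathrm{out}})$ is defined as the infinitesimal increment of the above

• $I$: identity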

3 Axioms 

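(M1) $G_\Phi(\Delta) \ge G_{\Phi\circ\Psi}(\Delta\circ\Psi)$
(M2) $G_\Phi(\Delta) \ge G_{\Psi\circ\Phi}(\Psi\circ\Delta)$
(E) $G_{\Phi\otimes I}(\Delta\otimes I) = G_\Phi(\Delta)$
(N) $G_p(\delta) = g_p(\delta)$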
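
As a small numerical illustration of what monotonicity means once everything is restricted to probability distributions, the following sketch assumes that g is the Fisher information (the choice made later for binary channels) and checks that it does not increase under one particular Markov map. The helper names `fisher` and `apply`, the map `T`, and the local data are hypothetical choices made for this illustration only.

```python
from fractions import Fraction as F

def fisher(p, d):
    """Fisher information g_p(delta) = sum_i delta_i^2 / p_i, assuming all p_i > 0."""
    return sum(di * di / pi for pi, di in zip(p, d))

def apply(T, v):
    """Apply a column-stochastic matrix T (given as a list of rows) to a vector v."""
    return [sum(row[x] * v[x] for x in range(len(v))) for row in T]

# Local data {p, delta} on P_3: delta is a tangent vector, so its entries sum to zero.
p = [F(1, 2), F(1, 3), F(1, 6)]
delta = [F(1), F(1), F(-2)]

# A hypothetical Markov map from P_3 to P_2 (each column sums to one).
T = [[F(1, 2), F(1, 4), F(0)],
     [F(1, 2), F(3, 4), F(1)]]

before = fisher(p, delta)                        # Fisher information of {p, delta}
after = fisher(apply(T, p), apply(T, delta))     # Fisher information after applying T
print(before, after, after <= before)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding. A single example of course proves nothing about the axioms; it only shows the direction of the inequality.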
4 Programming or simulation of channel families

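Suppose we have to fabricate a channel $\Phi_\theta$, drawn from a family $\{\Phi_\theta\}$, without knowing the value of
$\theta$ but given a probability distribution $q_\theta$ or a channel $\Psi_\theta$, drawn from a family $\{q_\theta\}$ or $\{\Psi_\theta\}$. More specifically,
we need a channel $\Lambda$ with

$$ \Phi_\theta = \Lambda \circ (I \otimes q_\theta), \tag{1} $$

or channels $\Lambda_a$ and $\Lambda_b$ with

$$ \Phi_\theta = \Lambda_b \circ (\Psi_\theta \otimes I) \circ \Lambda_a. \tag{2} $$

Here, note that $\Lambda$, $\Lambda_a$, and $\Lambda_b$ should not vary with the parameter $\theta$. Note also that the former is a special
case of the latter. Also, giving the value of $\theta$ with infinite precision corresponds to the case of $q_\theta = \delta(x - \theta)$.
Differentiating both ends of (1) and (2), and letting $\Phi_\theta = \Phi$, $q_\theta = q$, and $\Psi_\theta = \Psi$, we obtain

$$ \Delta = \Lambda \circ (I \otimes \delta), \tag{3} $$

and

$$ \Delta = \Lambda_b \circ (\Delta' \otimes I) \circ \Lambda_a, \tag{4} $$

where $\Delta \in \mathcal{T}_\Phi(\mathcal{C}_{k,l})$, $\delta \in \mathcal{T}_q(\mathcal{P}_{k'})$, and $\Delta' \in \mathcal{T}_\Psi(\mathcal{C}_{k',l'})$.

In this manuscript, we consider tangent simulation, that is, operations satisfying (1) (or (2)) and (3) (or
(4), resp.), at the point $\Phi_\theta = \Phi$ only. In particular, we are interested in point simulation of a 1-dimensional
subfamily. Note that simulation of $\{\Phi, \Delta\}$ is equivalent to that of the channel family $\{\Phi_{\theta+t} = \Phi + t\Delta\}_t$.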

5 Relation between g and G 

In this section, we study norms satisfying (M1), (M2), (E), and (N).

Theorem 1 Suppose (M1) and (N) hold. Then,

$$ G_\Phi(\Delta) \ge G^{\min}_\Phi(\Delta) := \sup_{p} g_{\Phi(p)}(\Delta(p)) = \max_{x} g_{\Phi(\cdot|x)}(\Delta(\cdot|x)). $$

Also, $G^{\min}_\Phi(\Delta)$ satisfies (M1), (M2), (E), and (N).

Proof.

$$ G_\Phi(\Delta) \ge G_{\Phi\circ p}(\Delta\circ p) = g_{\Phi(p)}(\Delta(p)). $$

The last identity is trivial. Obviously, $G^{\min}_\Phi(\Delta)$ satisfies (M1), (M2) and (N). (E) is seen from the rightmost
expression. ■
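
The lower bound of Theorem 1 is easy to evaluate for finite alphabets. The sketch below assumes that g is the Fisher information and uses the max-over-inputs form of the bound, so that each column of the transition matrix is treated as one output distribution. The function names and the sample channel are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as F

def fisher(p, d):
    """Fisher information g_p(delta) = sum_i delta_i^2 / p_i, assuming all p_i > 0."""
    return sum(di * di / pi for pi, di in zip(p, d))

def column(M, x):
    """Column x of a matrix given as a list of rows: the output law for input x."""
    return [row[x] for row in M]

def g_min(Phi, Delta):
    """Largest Fisher information of the local data {Phi(.|x), Delta(.|x)} over inputs x."""
    n_inputs = len(Phi[0])
    return max(fisher(column(Phi, x), column(Delta, x)) for x in range(n_inputs))

# A binary channel in the interior of C_{2,2} and a tangent vector (columns sum to zero).
t, s = F(1, 4), F(1, 3)
Phi = [[1 - t, s], [t, 1 - s]]
Delta = [[F(1), F(1)], [F(-1), F(-1)]]

print(g_min(Phi, Delta))
```

For this channel the first input gives the larger value, so the bound is attained by feeding the first input symbol deterministically.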
Theorem 2 Suppose (M2), (E) and (N) hold. Then

$$ G_\Phi(\Delta) \le G^{\max}_\Phi(\Delta) := \inf_{\Lambda, q, \delta} \left\{ g_q(\delta) \; ; \; \Lambda \circ (I \otimes q) = \Phi, \; \Lambda \circ (I \otimes \delta) = \Delta \right\}. $$

Also, $G^{\max}_\Phi(\Delta)$ satisfies (M1), (M2), (E), and (N).

Proof.

$$ g_q(\delta) = G_q(\delta) = G_{I\otimes q}(I \otimes \delta) \ge G_{\Lambda\circ(I\otimes q)}(\Lambda \circ (I \otimes \delta)) = G_\Phi(\Delta). $$

So we have the inequality. That $G^{\max}_\Phi(\Delta)$ satisfies (M1), (M2), (E), and (N) is trivial. ■

Corollary 3

$$ G^{\max}_\Phi(\Delta) \ge G^{\min}_\Phi(\Delta). $$

Obviously, $G^{\min}_\Phi(\Delta)$ and $G^{\max}_\Phi(\Delta)$ are not induced from any metric, i.e., they cannot be written as
$S(\Delta, \Delta)$, where $S$ is a positive real bilinear form. Indeed, we can show the following theorem:

Theorem 4 Suppose (M1), (M2), (E) and (N) hold. For any interior point $\Phi$ of $\mathcal{C}_{2,2}$, $G_\Phi(\Delta)$ cannot
be written as $S_\Phi(\Delta, \Delta)$, where $S_\Phi$ is a positive real bilinear form.

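Proof. Let $\Phi$ be the channel which corresponds to the stochastic matrix

$$ \begin{pmatrix} 1-t & s \\ t & 1-s \end{pmatrix}. $$

Also, let

$$ \Delta_1 := \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}, \qquad \Delta_2 := \begin{pmatrix} 0 & 1 \\ 0 & -1 \end{pmatrix}. $$

Since the family $\{\Phi + \theta\Delta_1\}_\theta$ can be simulated by the simulation suggested by the decomposition

$$ \Phi + \theta\Delta_1 = (1 - t + \theta)(\Phi + t\Delta_1) + (t - \theta)(\Phi - (1 - t)\Delta_1), $$

by (M2) and (E) we have $G_\Phi(\Delta_1) \le g_p(\delta)$, where $p = (1 - t, t)$ and $\delta = (1, -1)$. On the other hand, by choosing
the input as $(1, 0)$, $\{\Phi, \Delta_1\}$ induces $\{p, \delta\}$. Therefore, by (M1), $G_\Phi(\Delta_1) \ge g_p(\delta)$ and hence

$$ G_\Phi(\Delta_1) = g_p(\delta). $$

Similarly, we have

$$ G_\Phi(\Delta_2) = g_q(\delta'), $$

where $q = (s, 1-s)$ and $\delta' = (1, -1)$. Consider the family $\{\Phi + \theta(\Delta_1 + a\Delta_2)\}_\theta$. If $|a| < \min\left\{ \frac{s}{t}, \frac{1-s}{t}, \frac{s}{1-t}, \frac{1-s}{1-t} \right\}$,
this can be generated by the simulation suggested by

$$ \Phi + \theta(\Delta_1 + a\Delta_2) = (1 - t + \theta)(\Phi + t\Delta_1 + ta\Delta_2) + (t - \theta)(\Phi - (1 - t)\Delta_1 - a(1 - t)\Delta_2). $$

Therefore, $G_\Phi(\Delta_1 + a\Delta_2) \le g_p(\delta)$. On the other hand, by choosing the input as $(1, 0)$, $\{\Phi, \Delta_1 + a\Delta_2\}$ induces
$\{p, \delta\}$. Therefore,

$$ G_\Phi(\Delta_1 + a\Delta_2) = g_p(\delta). $$

On the other hand, if $G_\Phi(\Delta) = S_\Phi(\Delta, \Delta)$ with some bilinear form $S_\Phi$,

$$ \begin{aligned} G_\Phi(\Delta_1 + a\Delta_2) &= S_\Phi(\Delta_1 + a\Delta_2, \Delta_1 + a\Delta_2) \\ &= S_\Phi(\Delta_1, \Delta_1) + a^2 S_\Phi(\Delta_2, \Delta_2) + 2a S_\Phi(\Delta_1, \Delta_2) \\ &= g_p(\delta) + a^2 g_q(\delta') + 2a S_\Phi(\Delta_1, \Delta_2). \end{aligned} $$

Hence, it should hold that

$$ a^2 g_q(\delta') + 2a S_\Phi(\Delta_1, \Delta_2) = 0 $$

for any $|a| < \min\left\{ \frac{s}{t}, \frac{1-s}{t}, \frac{s}{1-t}, \frac{1-s}{1-t} \right\}$. Therefore, $g_q(\delta') = 0$. Since $\delta' \neq 0$, this is a contradiction. ■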
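
The decomposition used in the proof can be checked mechanically. The following sketch verifies, for one hypothetical choice of t, s, a and θ, that the two mixed matrices are valid channels and that their mixture reproduces Φ + θ(Δ1 + aΔ2). The admissible range of a is taken here to be the range in which both mixed matrices have nonnegative entries, which is an assumption of this sketch.

```python
from fractions import Fraction as F

def combine(*terms):
    """Linear combination sum_k c_k * M_k of 2x2 matrices, given as (c_k, M_k) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(2)] for i in range(2)]

def is_channel(M):
    """Column-stochastic check: nonnegative entries and columns summing to one."""
    nonneg = all(e >= 0 for row in M for e in row)
    return nonneg and all(M[0][j] + M[1][j] == 1 for j in range(2))

t, s = F(1, 4), F(1, 3)
Phi = [[1 - t, s], [t, 1 - s]]
D1 = [[1, 0], [-1, 0]]
D2 = [[0, 1], [0, -1]]

# Range of a for which both mixed matrices below stay column-stochastic.
bound = min(s / t, (1 - s) / t, s / (1 - t), (1 - s) / (1 - t))
a = F(1, 3)        # hypothetical value with |a| < bound
theta = F(1, 10)   # hypothetical value with -(1 - t) <= theta <= t

Psi_plus = combine((1, Phi), (t, D1), (t * a, D2))
Psi_minus = combine((1, Phi), (-(1 - t), D1), (-a * (1 - t), D2))

lhs = combine((1, Phi), (theta, D1), (theta * a, D2))
rhs = combine((1 - t + theta, Psi_plus), (t - theta, Psi_minus))
print(lhs == rhs, is_channel(Psi_plus), is_channel(Psi_minus))
```

The mixing weights are (1 - t + θ, t - θ), which at θ = 0 reduce to p = (1 - t, t) with tangent δ = (1, -1), independently of a. This is the reason the same program cost bounds every direction Δ1 + aΔ2 in the proof.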

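Observe that an argument parallel to the above proof applies also to $\mathcal{C}_{k,l}$ ($k, l \ge 3$). The following
property is useful in the computation of $G^{\max}$.

Proposition 5 Let $\{T^{(i)}\}_{i=1}^{n}$ be the extreme points of $\mathcal{C}$. Then

$$ G^{\max}_\Phi(\Delta) = \min_{q, \delta} g_q(\delta), $$

where $q = (q_1, \dots, q_n)$ is a probability distribution over $\{T^{(i)}\}$ with

$$ \Phi = \sum_{i=1}^{n} q_i T^{(i)}, $$

and $\delta = (\delta_1, \dots, \delta_n)$ satisfies $\Delta = \sum_{i=1}^{n} \delta_i T^{(i)}$.

Proof. Consider a simulation suggested by the decomposition

$$ \Phi = \int \Psi \, dP(\Psi), \qquad \Delta = \int \Psi f \, dP(\Psi), $$

where $P$ is a probability measure over $\mathcal{C}$ and $\int f \, dP(\Psi) = 0$. Here the 'program' is $\{P, f \circ P\}$, where $f \circ P$
is the signed measure defined by $f \circ P(A) = \int_A f \, dP(\Psi)$. Letting $\Psi = \sum_i p_{i|\Psi} T^{(i)}$, we obtain another
simulation corresponding to the decomposition

$$ \Phi = \sum_{i=1}^{n} q_i T^{(i)}, \qquad \Delta = \sum_{i=1}^{n} \delta_i T^{(i)}, $$

where

$$ q_i := \int p_{i|\Psi} \, dP(\Psi), \qquad \delta_i := \int p_{i|\Psi} f \, dP(\Psi). $$

Here the 'program' is the pair $\{q, \delta\}$. The following Markov map sends the pair $\{P, f \circ P\}$ to the pair
$\{q, \delta\}$: upon accepting $\Psi$, which is generated according to the probability measure $P$, generate $T^{(i)}$ with
probability $p_{i|\Psi}$. Therefore, by monotonicity,

$$ g_P(f \circ P) \ge g_q(\delta), $$

which implies the assertion. ■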
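
The coarse-graining step of this proof can be illustrated on binary channels. The sketch below assumes that g is the Fisher information, takes a hypothetical program supported on two non-extreme channels, pushes it forward to a program on the four extreme points of C_{2,2}, and checks that the cost does not increase while Φ and Δ are reproduced. All numerical values and helper names are chosen only for illustration.

```python
from fractions import Fraction as F

# The four extreme points of C_{2,2} as column-stochastic matrices.
T = [
    [[1, 0], [0, 1]],
    [[0, 0], [1, 1]],
    [[0, 1], [1, 0]],
    [[1, 1], [0, 0]],
]

def mix(weights, mats):
    """Weighted sum of 2x2 matrices."""
    return [[sum(w * M[i][j] for w, M in zip(weights, mats)) for j in range(2)]
            for i in range(2)]

def fisher(p, d):
    """sum_i d_i^2 / p_i over the support of p (d_i is assumed to vanish where p_i = 0)."""
    return sum(di * di / pi for pi, di in zip(p, d) if pi != 0)

# Program {P, f o P}: two channels Psi_1, Psi_2 drawn with probabilities P, with score f.
P = [F(1, 2), F(1, 2)]
f = [F(1), F(-1)]                      # integrates to zero against P
cond = [[F(1, 2), F(1, 2), 0, 0],      # p_{i|Psi_1}: weights of Psi_1 on the extreme points
        [0, F(1, 2), F(1, 2), 0]]      # p_{i|Psi_2}
Psi = [mix(c, T) for c in cond]

fP = [Pj * fj for Pj, fj in zip(P, f)]                                  # signed measure f o P
q = [sum(P[j] * cond[j][i] for j in range(2)) for i in range(4)]        # coarse-grained weights
delta = [sum(fP[j] * cond[j][i] for j in range(2)) for i in range(4)]   # coarse-grained tangent

cost_fine = fisher(P, fP)
cost_coarse = fisher(q, delta)
print(cost_fine, cost_coarse)
```

Both programs generate the same local data, so only programs supported on extreme points need to be searched when computing the upper bound.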

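6 Binary channels $\mathcal{C}_{2,2}$

In this section, we suppose $g$ is the Fisher information metric. $\mathcal{C}_{2,2}$ has four extreme points,

$$ T^{(1)} := \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad T^{(2)} := \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}, \quad T^{(3)} := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad T^{(4)} := \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, $$

and can be parameterized as

$$ \begin{pmatrix} 1-t & s \\ t & 1-s \end{pmatrix}. $$

Hence the space can be viewed as a square. Consider a one-parameter subfamily $\{\Phi_\theta\}$ of $\mathcal{C}_{2,2}$ passing through
$\Phi$. Let $\Psi_A$ and $\Psi_B$ be the intersections of the edges of $\mathcal{C}_{2,2}$ with the tangent line at $\Phi$ with the tangent $\Delta$.
Obviously, $\{\Phi, \Delta\}$ can be simulated as a probabilistic mixture of $\Psi_A$ and $\Psi_B$. Hence, defining $a$ and $b$ by
$\Delta = a(\Psi_A - \Psi_B)$ and $\Phi = b\Psi_A + (1 - b)\Psi_B$,

$$ G^{\max}_\Phi(\Delta) \le \frac{a^2}{b} + \frac{a^2}{1-b}. $$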

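Suppose $\Psi_A$ and $\Psi_B$ can be discriminated with certainty by observing the output for a properly chosen
input. This occurs if and only if one of the following is true:

$$ \begin{aligned} (\Psi_A)_{11} = 1 \;&\&\; (\Psi_B)_{01} = 1, \\ (\Psi_A)_{01} = 1 \;&\&\; (\Psi_B)_{11} = 1, \\ (\Psi_A)_{10} = 1 \;&\&\; (\Psi_B)_{00} = 1, \\ (\Psi_A)_{00} = 1 \;&\&\; (\Psi_B)_{10} = 1. \end{aligned} $$

In such cases, one can extract the Fisher information of the binary distribution which is used to mix $\Psi_A$ and
$\Psi_B$. Therefore,

$$ G^{\min}_\Phi(\Delta) \ge \frac{a^2}{b} + \frac{a^2}{1-b}. $$

Hence, due to Corollary 3, we have

$$ G_\Phi(\Delta) = G^{\min}_\Phi(\Delta) = G^{\max}_\Phi(\Delta) = \frac{a^2}{b} + \frac{a^2}{1-b}. $$

In particular, if

$$ \Phi = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, $$

this is the case for any $\Delta$.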
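In general, however, the simulation by the mixture of $\Psi_A$ and $\Psi_B$ is not optimal. For example, let

$$ \begin{aligned} \Phi &:= aT^{(1)} + bT^{(2)} + cT^{(3)} = (a - t)T^{(1)} + (b + t)T^{(2)} + (c - t)T^{(3)} + tT^{(4)} \\ &= \begin{pmatrix} a & c \\ b + c & a + b \end{pmatrix} = \begin{pmatrix} a & c \\ 1 - a & 1 - c \end{pmatrix}, \end{aligned} $$

$$ \begin{aligned} \Delta &:= T^{(3)} - T^{(1)} = (1 - s)T^{(3)} + s\left(T^{(2)} + T^{(4)} - T^{(1)}\right) - T^{(1)} \\ &= \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} = -(1 + s)T^{(1)} + sT^{(2)} + (1 - s)T^{(3)} + sT^{(4)}, \end{aligned} $$

with

$$ a + b + c = 1, \quad 0 \le t \le 1, \quad s \in \mathbb{R}. $$

We use Proposition 5:

$$ G^{\max}_\Phi(\Delta) = \min_{t \in [0, \min\{a, c\}]} \min_{s} \left\{ \frac{(1 + s)^2}{a - t} + \frac{s^2}{b + t} + \frac{(1 - s)^2}{c - t} + \frac{s^2}{t} \right\}. $$

First, we optimize over $s$; the minimum is achieved at

$$ s = \frac{(a - c)\, t\, (t + b)}{-t^2 + 2act + abc}. $$

Hence,

$$ \begin{aligned} G^{\max}_\Phi(\Delta) &= \min_{t \in [0, \min\{a, c\}]} \frac{2t + ab + bc}{-t^2 + 2act + abc} \\ &= \min_{t \in [0, \min\{a, c\}]} \frac{2t + ab + bc}{\left( \left(ac + \sqrt{a^2c^2 + abc}\right) - t \right)\left( t - \left(ac - \sqrt{a^2c^2 + abc}\right) \right)}. \end{aligned} $$

After some computation, one can verify

$$ ac + \sqrt{a^2c^2 + abc} = ac + \sqrt{a^2c^2 + ac(1 - a - c)} \ge \min\{a, c\}, $$

so the denominator is positive on the domain. Therefore, the function to be optimized is monotone increasing in the domain. Hence, the minimum is
achieved at $t = 0$. Therefore,

$$ G^{\max}_\Phi(\Delta) = \frac{a + c}{ac} = \frac{1}{a} + \frac{1}{c}. $$

Note that the optimal simulation uses three extreme points, $T^{(1)}$, $T^{(2)}$, and $T^{(3)}$. It is not difficult to
compute

$$ G^{\min}_\Phi(\Delta) = \max\left\{ \frac{1}{a} + \frac{1}{1 - a}, \; \frac{1}{c} + \frac{1}{1 - c} \right\}. $$

Since $a + c \le 1$, $G^{\max}_\Phi(\Delta) \ge G^{\min}_\Phi(\Delta)$. ("=" holds if and only if $a + c = 1$.)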
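
The computation in this example can be cross-checked numerically. The sketch below fixes hypothetical weights a, b, c, evaluates the four-term Fisher information of the program over the extreme points, and checks on a grid of t that inserting the optimal s gives the stated rational function, that this function increases in t, and that the resulting upper bound dominates the lower bound computed column by column. The function names and the grid are illustrative assumptions.

```python
from fractions import Fraction as F

a, b, c = F(1, 5), F(3, 10), F(1, 2)   # hypothetical mixture weights with a + b + c = 1

def cost(t, s):
    """Fisher information of the program with weights (a-t, b+t, c-t, t)
    and tangent (-(1+s), s, 1-s, s), for 0 < t < min(a, c)."""
    return (1 + s)**2 / (a - t) + s**2 / (b + t) + (1 - s)**2 / (c - t) + s**2 / t

def s_star(t):
    """Minimizer over s for fixed t."""
    return (a - c) * t * (t + b) / (-t**2 + 2*a*c*t + a*b*c)

def reduced(t):
    """Value of the cost after minimizing over s."""
    return (2*t + a*b + b*c) / (-t**2 + 2*a*c*t + a*b*c)

grid = [min(a, c) * F(k, 100) for k in range(1, 100)]
agree = all(cost(t, s_star(t)) == reduced(t) for t in grid)
increasing = all(reduced(u) < reduced(v) for u, v in zip([F(0)] + grid, grid))

g_max = reduced(F(0))                                # minimum attained at t = 0
g_min = max(1/a + 1/(1 - a), 1/c + 1/(1 - c))        # Fisher information per input column
print(agree, increasing, g_max, g_min)
```

Exact rational arithmetic makes the identity between the two expressions an equality test rather than a tolerance check. The grid only samples the interior of the domain, so the monotonicity check is a sanity test and not a proof.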